The measurement of distance between two objects is generalized to the case where the objects are no longer points but are one-dimensional. Additional concepts such as non-extensibility, curvature constraints, and non-crossing become central to the notion of distance. Analytical and numerical results are given for some specific examples, and applications to biopolymers are discussed.

I. INTRODUCTION

The distance, as conventionally defined between two zero-dimensional objects (points) A and B at positions $\mathbf{r}_A$ and $\mathbf{r}_B$, is the minimal arclength travelled in the transformation from A to B. A transformation $\mathbf{r}(t)$ between A and B is a vector function which may be parametrized by a scalar variable $t$: $0 \le t \le T$, $\mathbf{r}(0) = \mathbf{r}_A$, $\mathbf{r}(T) = \mathbf{r}_B$, and the distance travelled is a functional of $\mathbf{r}(t)$. The (minimal) transformation $\mathbf{r}^*(t)$ is an object of dimension one higher than A or B, i.e. it yields a distance that is one-dimensional. The distance $\mathcal{D}^*$ is found through the variation of the functional [1]:

$$\mathcal{D}^* = \mathcal{D}[\mathbf{r}^*(t)] \quad \text{where } \mathbf{r}^*(t) \text{ satisfies} \tag{1a}$$

$$\delta \int_0^T dt\, \left(g_{\mu\nu}\,\dot{x}^\mu \dot{x}^\nu\right)^{1/2} = 0 \tag{1b}$$

$$\text{or} \quad \delta \int_0^T dt\, \sqrt{\dot{\mathbf{r}}^2} = 0 \quad \text{(Euclidean metric).} \tag{1c}$$

Here $\dot{x} = dx/dt$, and $\dot{\mathbf{r}} = d\mathbf{r}/dt$. The boundary conditions mentioned above are present at the end points of the integral. The Einstein summation convention will be used where convenient, e.g. in eq. (1b); however, all the analysis here deals with spatial coordinates, $\nu = 1,2,3$, on a Euclidean metric. Generalizations to dimension higher than 3, as well as to non-Euclidean metrics, are straightforward to incorporate into the formalism.

On a Euclidean metric, $g_{\mu\nu} = \delta_{\mu\nu}$ and the minimal distance becomes the diagonal of a hypercube. However, formulated as above, the solutions minimizing $\mathcal{D}$ are infinitely degenerate, because particles moving at various speeds but tracing the same trajectory over the total time $T$ all give the same distance. To circumvent this problem, what is typically done is to let one of the space variables (e.g. $x$) become the independent variable. However, for higher dimensional objects, or zero dimensional objects on a manifold with nontrivial topology, there is no guarantee that the dependent variables $(y, z)$ constitute single valued functions of $x$. Alternatively, one can study the 'time' trajectory of the parametric curve defined above, but under a gauge that fixes the speed to a constant $v_0$, for example. One can either fix the gauge from the outset with Lagrange multipliers, or choose a gauge that may simplify the problem after finding the extremum equations. The latter is often simpler in practice.

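To be specific, the effective Lagrangian $\mathcal{L}$ appearing in the above problem is $\sqrt{\dot{\mathbf{r}}^2}$, and the Euler-Lagrange (EL) equations are

$$\dot{\hat{\mathbf{v}}} = 0 \tag{2}$$

with $\hat{\mathbf{v}}$ the unit vector in the direction of the velocity. The boundary conditions are

$$\mathbf{r}^*(0) = \mathbf{r}_A \quad \text{and} \quad \mathbf{r}^*(T) = \mathbf{r}_B. \tag{3}$$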

Since the derivative of a unit vector is always orthogonal to that vector, equation (2) says that the direction of the velocity cannot change, and therefore straight line motion results. Applying the boundary conditions gives $\hat{\mathbf{v}} = (\mathbf{r}_B - \mathbf{r}_A)/|\mathbf{r}_B - \mathbf{r}_A|$. However, any function $\mathbf{v}(t) = |v_0(t)|\,\hat{\mathbf{v}}$ satisfying the boundary conditions is a solution, so long as $\int_0^T dt\,|v_0(t)| = |\mathbf{r}_B - \mathbf{r}_A|$. This is the infinite degeneracy of solutions mentioned above. Then $\mathbf{r}^*(t) = \mathbf{r}_A + \hat{\mathbf{v}}\int_0^t dt'\,|v_0(t')|$, and $\mathcal{D}^* = \int_0^T dt\,\sqrt{\dot{\mathbf{r}}^2} = \int_0^T dt\,|v_0(t)| = |\mathbf{r}_B - \mathbf{r}_A|$. At this point we could fix the parameterization by choosing $|v_0(t)| = |\mathbf{r}_B - \mathbf{r}_A|/T$ (constant speed), for example.
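The degeneracy described above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the end points and the three progress schedules are illustrative choices, not taken from the text. Each schedule traces the same straight segment at a different speed profile, and the accumulated arclength is the same in every case.

```python
import numpy as np

def path_length(points):
    """Sum of segment lengths along a sampled path (one sample per row)."""
    return float(np.sum(np.linalg.norm(np.diff(points, axis=0), axis=1)))

r_a = np.array([0.0, 0.0, 0.0])
r_b = np.array([1.0, 2.0, 2.0])          # |r_b - r_a| = 3

tau = np.linspace(0.0, 1.0, 2001)        # t / T
# Hypothetical monotonic progress functions, all running from 0 to 1.
schedules = {
    "constant speed": tau,
    "slow start": tau**3,
    "fast start": np.sqrt(tau),
}

distances = {
    name: path_length(r_a + progress[:, None] * (r_b - r_a))
    for name, progress in schedules.items()
}
for name, value in distances.items():
    print(f"{name:>14s}: {value:.12f}")
```

Only the constant-speed schedule corresponds to the gauge choice $|v_0(t)| = |\mathbf{r}_B - \mathbf{r}_A|/T$; the other two are equally valid minimal transformations.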

The extremum is a minimum, as can be shown by analyzing the eigenvalues of the matrix $\delta^2\mathcal{D}/\delta x_\nu(t)\,\delta x_\mu(t') = -\delta_{\mu\nu}\,\delta''(t - t')$. Diagonalizing by Fourier transform gives positive elements $+\omega_n^2\,\delta_{\mu\nu}\,\delta(\omega_n - \omega_n')$ for the stability matrix and thus positive eigenvalues.

In what follows we generalize the notion of distance to higher dimensional objects, specifically space- 
curves. We will see many of the above themes reiterated, as well as some fundamentally new features that 
emerge when one treats the space curves as non-extensible, having some persistence length or curvature 
constraint, and non-crossing or unable to pass through themselves. We provide analytical and numerical 
results for some prototypical examples for non-extensible chains, and we lay the foundations for treating 
curvature and non-crossing constraints.

II. DISTANCE METRIC FOR ONE DIMENSIONAL OBJECTS



The distance $\mathcal{D}^*$ between two one-dimensional objects (which we refer to as space curves or strings) A and B having configurations $\mathbf{r}_A(s)$ and $\mathbf{r}_B(s)$, $0 \le s \le L$, is obtained from the transformation from A to B that minimizes the integrated distance travelled. By integrated distance we mean the cumulative arclength all elements of the string had to move in the transformation from A to B. For the transformation to exist, strings A and B must have the same length (although this condition may be relaxed by allowing specific extensions or contractions). For the distance to be finite, open space curves must be finite in length. For closed non-crossing space curves, A and B must be in the same topological class for the transformation to exist. Describing the transformation $\mathbf{r}(s,t)$ requires two scalar parameters, one for arc length $s$ along the string and another measuring progress as in the zero-dimensional case, say $t$: $0 \le t \le T$, so that $\mathbf{r}(s,0) = \mathbf{r}_A(s)$ and $\mathbf{r}(s,T) = \mathbf{r}_B(s)$. The distance travelled is a functional of the vector function $\mathbf{r}(s,t)$. The minimal transformation $\mathbf{r}^*(s,t)$ is an object of dimension one higher than A or B, i.e. it yields a distance that is two-dimensional. The problem does not map to a simple soap film, since there are many configuration pairs that have zero area between them but nonzero distance travelled, e.g. a straight line displaced along its own axis, or that in figure 1C. The analogue to a higher-dimensional surface of minimal area when the 'time' $t$ is included is closer but inexact (see footnote below).

We can construct the effective Lagrangian along the same lines as the zero-dimensional case. Using the shorthand $\mathbf{r} = \mathbf{r}(s,t)$, $\dot{\mathbf{r}} = \partial\mathbf{r}/\partial t$, $\mathbf{r}' = \partial\mathbf{r}/\partial s$, the distance travelled is [*]

$$\mathcal{D}[\mathbf{r}] = \int_0^L ds \int_0^T dt\, \sqrt{\dot{\mathbf{r}}^2}. \tag{4}$$

However, to meaningfully represent the distance a string must move to reconfigure itself from conformation A to B, the transformation must be subject to several auxiliary conditions.

The first of these is non-extensibility. Points along the space curve cannot move independently of one another but are constrained to integrate to fixed length, so the curve cannot stretch or contract. Thus there is a Lagrange multiplier $\lambda(s,t)$ weighting the (non-holonomic) constraint:

$$\sqrt{\mathbf{r}'^2} = 1. \tag{5}$$

This constraint ensures a parameterization of the string with unit tangent vector $\hat{\mathbf{t}} = \mathbf{r}'$, so that the total length of the string is $L = \int_0^L ds\, \sqrt{\mathbf{r}'^2} = \int_0^L ds$. In the language of differential geometry, the space curve is a unit-speed curve.

If the constraint (5) were not present in eq. (4), each point along the space curve could follow a straight line path from A to B and the problem of minimizing the distance would be trivial. Equivalently, setting $\lambda = 0$ should reduce the problem to a sum of straight lines analogously to the zero-dimensional case above.

As in the case of distance between points, one can fix the $t$-parameterization from the outset by introducing a Lagrange multiplier $\alpha(t)$ that fixes the total distance covered per time $\int_0^L ds\, \sqrt{\dot{\mathbf{r}}^2}$ to a known function $f(t)$. While this approach removes the infinite degeneracy mentioned above, as a global isoperimetric condition it reduces the symmetry of the problem. For example there would then be no conservation law that could be written to capture the invariance of the effective Lagrangian with respect to the independent variable $t$. For these reasons we choose to leave the answer unparameterized with respect to $t$, analogous to the point-distance case above.

[*] The distance-metric action in eq. (4) bears a strong resemblance to the Nambu-Goto action for a classical relativistic string [2]: $S_{NG}[\mathbf{r}(\sigma,\tau)] = \int d\sigma\, d\tau\, \sqrt{(\dot{\mathbf{r}} \cdot \mathbf{r}')^2 - (\dot{\mathbf{r}})^2 (\mathbf{r}')^2}$, where $\mathbf{r}$ in $S_{NG}$ is now a four-vector and the dot product is the relativistic dot product. This action is physically interpreted as the (Lorentz invariant) world-sheet area of the string. If eq. (4) could be mapped by suitable choice of gauge to the minimization of the Nambu-Goto action, one could exploit here the same reparameterization invariance that results in wave equation solutions to the equations of motion for the classical relativistic string, by choosing a parameterization such that $\dot{\mathbf{r}} \cdot \mathbf{r}' = 0$ (for the purely geometrical problem, the discriminant under the square root in the action has opposite sign). Unfortunately, because the velocity in the distance-metric action is a 3-velocity rather than a 4-velocity, our action only accumulates area when parts of the string move in 3-space, in contrast to the Nambu-Goto action which accumulates area even for a static string. The distance-metric action eq. (4) has a lower symmetry than that for the classical relativistic string. $\mathcal{D}^*$ cannot depend on the time the transformation took, while the world sheet area does. Conversely, if we take e.g. configuration A at $t = 0$ to be a straight line of length $L$, and configurations B at $t = T$ to be the same straight line but displaced along its own axis by varying amounts $d$, the geometrical area for all transformations would be $LT$, while the distances $\mathcal{D}^*$ for each transformation would be $Ld$.

A. Ideal chains

There are many examples of nontrivial transformations between two strings A and B where chain non-crossing is unimportant (c.f. figures 1A and 1B). Here we derive the Euler-Lagrange equations for this case.

From equations (4)-(5), the extrema of the distance $\mathcal{D}$ are found from

$$\delta \int_0^L ds \int_0^T dt\, \mathcal{L}(\dot{\mathbf{r}}, \mathbf{r}') = 0, \qquad \mathcal{L} = \sqrt{\dot{\mathbf{r}}^2} - \lambda \sqrt{\mathbf{r}'^2}. \tag{6}$$

Performing the variation gives

$$\delta\mathcal{D} = \int_0^L ds\, \left[\mathbf{p}_t \cdot \delta\mathbf{r}\right]_0^T + \int_0^T dt\, \left[\mathbf{p}_s \cdot \delta\mathbf{r}\right]_0^L - \int_0^L ds \int_0^T dt\, \left(\frac{\partial \mathbf{p}_t}{\partial t} + \frac{\partial \mathbf{p}_s}{\partial s}\right) \cdot \delta\mathbf{r}, \tag{7}$$



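where the generalized momenta $\mathbf{p}_t$ and $\mathbf{p}_s$ are given by:

$$\mathbf{p}_t = \frac{\partial\mathcal{L}}{\partial\dot{\mathbf{r}}} = \hat{\mathbf{v}} \quad \text{and} \quad \mathbf{p}_s = \frac{\partial\mathcal{L}}{\partial\mathbf{r}'} = -\lambda\hat{\mathbf{t}} \tag{8}$$

where $\hat{\mathbf{v}}$ is again the unit velocity vector, and $\hat{\mathbf{t}}$ is the unit tangent to the curve.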

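The EL equation follows from the last term in (7), and yields a partial differential equation for the minimal transformation $\mathbf{r}^*(s,t)$:

$$(\dot{\mathbf{r}}^2)\,\ddot{\mathbf{r}} - (\dot{\mathbf{r}} \cdot \ddot{\mathbf{r}})\,\dot{\mathbf{r}} = |\dot{\mathbf{r}}|^3 \left(\lambda\mathbf{r}'' + \lambda'\mathbf{r}'\right) \tag{9}$$

where we have used the facts that $|\mathbf{r}'| = 1$ and $\mathbf{r}' \cdot \mathbf{r}'' = \hat{\mathbf{t}} \cdot \boldsymbol{\kappa} = 0$, since the tangent is always orthogonal to the curvature at any given point along a space curve.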

Equation (9) can be written in terms easier to understand intuitively by using the unit velocity vector $\hat{\mathbf{v}}$, tangent $\hat{\mathbf{t}}$, and curvature $\boldsymbol{\kappa}$: [***]

$$\dot{\hat{\mathbf{v}}} = \lambda\boldsymbol{\kappa} + \lambda'\hat{\mathbf{t}}. \tag{10}$$

Comparison of equations (10) and (2) illustrates the point made earlier that setting the Lagrange multiplier $\lambda$ corresponding to the non-extensibility condition to zero results in straight line solutions for all points along the space curve. Conversely, the condition that the space curve form a contiguous object results generally in nonzero deviation from straight line motion. So in comparing various extremal solutions to eq. (10), the minimal solution will minimize $|\lambda|$ everywhere.

The boundary conditions are obtained from the first two terms in (7). Since the initial and final configurations are specified, the variation $\delta\mathbf{r}$ vanishes at $t = 0, T$, and the corresponding boundary conditions, or initial and final conditions, are:

$$\mathbf{r}^*(s,0) = \mathbf{r}_A(s) \quad \text{and} \quad \mathbf{r}^*(s,T) = \mathbf{r}_B(s). \tag{11}$$

[***] The invariance of the Lagrangian to $(s,t)$ leads to conservation laws by Noether's theorem [1], which here take the form of divergence conditions. However these generally contain no new information beyond the EL equations, and can be obtained by dotting eq. (10) with either $\mathbf{r}'$ to give $\lambda' = \dot{\hat{\mathbf{v}}} \cdot \hat{\mathbf{t}}$, or $\dot{\mathbf{r}}$ to give $\hat{\mathbf{v}} \cdot (\lambda\hat{\mathbf{t}})' = 0$.

Since the end points of the string are free during the transformation, $\delta\mathbf{r} \neq 0$ at $s = 0, L$, and so the conjugate momenta must vanish: $\mathbf{p}_s(0,t) = \mathbf{p}_s(L,t) = 0$. This means that $\lambda\hat{\mathbf{t}} = 0$ at the end points. However, since $\hat{\mathbf{t}}$ cannot be zero, the only way this can occur is for $\lambda(0,t) = \lambda(L,t) = 0$. The Lagrange multiplier, which represents the conjugate force or tension to ensure an inextensible chain, must vanish at the end points of the string. If $\lambda = 0$, the EL equation (10) gives $\dot{\hat{\mathbf{v}}} = \lambda'\hat{\mathbf{t}}$ at the end points. However, since $\hat{\mathbf{v}}$ is a unit vector, $\dot{\hat{\mathbf{v}}}$ is orthogonal to $\hat{\mathbf{v}}$ (or $\dot{\mathbf{r}}$), and we have finally the boundary conditions at the end points of the string:

$$\lambda'\,\hat{\mathbf{v}} \cdot \hat{\mathbf{t}} = 0 \quad \text{(at the end points).} \tag{12}$$

Equation (12) has three possible solutions. One is that $\hat{\mathbf{v}} \cdot \hat{\mathbf{t}} = 0$ or equivalently $\dot{\mathbf{r}} \cdot \mathbf{r}' = 0$, which corresponds to pure rotation of the end points. It is worth mentioning that the end points of the classical relativistic string also move transversely to the string. Moreover, because of the Minkowski metric the end points must also move at the speed of light. Here, however, because Lorentz invariance is not at issue, additional solutions are possible. The end points of our string can be at rest, $\mathbf{v} = 0$, and satisfy the boundary condition (12). The last solution of eq. (12) is for $\lambda' = 0$. Because $\lambda$ also vanishes at the end points, eq. (10) gives $\dot{\hat{\mathbf{v}}} = 0$, or straight line motion. In summary, the three possible boundary conditions for the string end points are:

$$\hat{\mathbf{v}} \cdot \hat{\mathbf{t}} = 0 \quad \text{(pure rotation)} \tag{13a}$$

$$\mathbf{v} = 0 \quad \text{(at rest)} \tag{13b}$$

$$\dot{\hat{\mathbf{v}}} = 0 \quad \text{(straight line motion)} \tag{13c}$$

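Whether an extremal transformation is a minimum can be determined by examining the second variation of the functional (6):

$$\delta^2\mathcal{D} = \frac{1}{2} \int_0^L \int_0^T ds\, dt\, \left[\delta\dot{\mathbf{r}} \cdot \mathbf{I} \cdot \delta\dot{\mathbf{r}} + \delta\mathbf{r}' \cdot \boldsymbol{\Lambda} \cdot \delta\mathbf{r}'\right], \tag{14}$$

where $I_{ij} = (\dot{\mathbf{r}}^2\delta_{ij} - \dot{r}_i\dot{r}_j)/|\dot{\mathbf{r}}|^3$ and $\Lambda_{ij} = -\lambda(s,t)\,\delta_{ij}$, and $\delta\mathbf{r}'$ and $\delta\dot{\mathbf{r}}$ are the $s$ and $t$ derivatives of the variation $\delta\mathbf{r}$ from the extremal path.

We now apply these concepts to some specific examples.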

B. Examples 

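Translations. If two space curves differ by a translation, $\mathbf{r}_B(s) = \mathbf{r}_A(s) + \mathbf{d}$ with $\mathbf{d}$ a constant vector. The appropriate boundary condition for the end points is (13c). The points along the string can all satisfy (10) with $\dot{\hat{\mathbf{v}}} = 0$ and $\lambda = 0$ everywhere (setting $\dot{\hat{\mathbf{v}}} = 0$ in eq. (10) forces $\lambda' = 0$ and $\lambda\boldsymbol{\kappa} = 0$, and $\lambda$ vanishes at the end points), and straight line motion results: $\mathbf{r}^*(s,t) = \mathbf{r}_A(s) + (\mathbf{r}_B(s) - \mathbf{r}_A(s))\,t/T$. The distance is $\mathcal{D}^* = L|\mathbf{d}|$. This is the 1-dimensional analogue to eqs. (2), (3).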

Piece-wise linear space curves. Suppose initially the curvature of some section of the string is zero. Then, taking the dot product of $\hat{\mathbf{v}}$ with eq. (10), we see that eq. (12) holds for all points along that section of the string. So the string either rotates or translates (or remains at rest if that segment has completed the transformation).

Generally if one string partner has curvature (e.g. $\mathbf{r}_A$ in fig. 1B) the transformation is more complicated, but if both $\mathbf{r}_A$ and $\mathbf{r}_B$ are straight lines as in figure 1A, equation (12) holds for both. It is then reasonable to seek solutions $\mathbf{r}^*$ of the EL equation such that equation (12) holds for all $(s,t)$.

Consider the two space curves shown in figure 1A with $\mathbf{r}_A(s) = s\hat{\mathbf{x}}$ and $\mathbf{r}_B(s) = s\hat{\mathbf{y}}$, both with curvature $\kappa = 0$. We first investigate rotation from A to B. This transformation satisfies the EL equation so appears to be extremal: $\mathbf{r} = s\hat{\mathbf{r}} = s(\cos\omega t\,\hat{\mathbf{x}} + \sin\omega t\,\hat{\mathbf{y}})$. The velocity is $\dot{\mathbf{r}} = s\omega\hat{\boldsymbol{\theta}}$, so the distance is $\mathcal{D}[\mathbf{r}_{\rm ROT}(s,t)] = \pi L^2/4$. Taking the dot product of $\hat{\mathbf{t}}$ with eq. (10) gives $\lambda' = \hat{\mathbf{t}} \cdot \dot{\hat{\mathbf{v}}} = -\omega$, or $\lambda(s,t) = \lambda_0 - \omega s$. For the transformation to be extremal, the conjugate momenta must also vanish at the string end points, or $\lambda(0,t) = \lambda(L,t) = 0$. This is impossible to achieve with this functional form, so the transformation is not extremal.

We may however include the subsidiary condition here that the end at $s = 0$ is held fixed, $\mathbf{r}^*(0,t) = \mathbf{r}_A(0) = \mathbf{r}_B(0)$. Then the end point of the string at $s = 0$ is determined, and the variations $\delta\mathbf{r}(0,t)$ must vanish. Now only $\lambda(L,t) = 0$ is required, and so $\lambda(s,t) = \omega(L - s)$. The transformation is extremal.

Whether it is a minimum can be determined by examining the second variation (14). For the transformation $\mathbf{r}_{\rm ROT}(s,t)$, the matrix $\mathbf{I}$ in (14) is non-negative definite, a necessary condition for a local minimum [1]; however, $\boldsymbol{\Lambda}$ is negative definite, so the character of the extremum is determined by the interplay of the two terms in (14). Variations $\delta\mathbf{r}$ that preserve $\mathbf{r}'^2 = 1$, or $2\mathbf{r}' \cdot \delta\mathbf{r}' = 0$, are satisfied in this example by $\delta\mathbf{r} = f(s,t)\,\hat{\boldsymbol{\theta}}$, where $f(s,t)$ must satisfy the boundary conditions $\delta\mathbf{r}(0,t) = \delta\mathbf{r}(s,0) = \delta\mathbf{r}(s,T) = 0$. We thus let the variations have the functional form $\delta\mathbf{r} = \epsilon\sin(ks)\sin(n\pi t/T)\,\hat{\boldsymbol{\theta}}$, where $\hat{\boldsymbol{\theta}} = -\sin\omega t\,\hat{\mathbf{x}} + \cos\omega t\,\hat{\mathbf{y}}$, $n$ is a positive integer, and $k$ is unrestricted. Inserting this functional form for the variations into eq. (14) gives $\delta^2\mathcal{D} = (\epsilon^2\pi/8)\,\mathcal{F}(kL)$, where $\mathcal{F}(x)$ is a non-positive, monotonically decreasing function, with a maximum of zero at $kL = 0$. In fact, to lowest order $\delta^2\mathcal{D} \approx -(\pi\epsilon^2/2160)\,(kL)^6$.

The extremum corresponding to pure rotation of curve A into B is a maximum!
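The sign and leading behaviour of $\mathcal{F}$ can be checked numerically. Substituting $\lambda = \omega(L-s)$, $|\dot{\mathbf{r}}| = s\omega$, $\omega T = \pi/2$ and the trial variation into eq. (14) gives, by our own evaluation rather than a formula quoted from the text, $\mathcal{F}(x) = \int_0^x du\,\sin^2 u/u - \int_0^x du\,(x-u)\cos^2 u$, whose series starts at $-x^6/270$. The sketch below assumes that form and uses a plain midpoint rule; the function name `shape_function` is hypothetical.

```python
import numpy as np

def shape_function(x, n=200_000):
    """F(x) = int_0^x sin^2(u)/u du - int_0^x (x-u) cos^2(u) du (midpoint rule).

    The first integral comes from the I term of eq. (14), the second from
    the Lambda term with lambda = omega (L - s). Assumed form, see lead-in.
    """
    h = x / n
    u = (np.arange(n) + 0.5) * h              # midpoints avoid the u = 0 end point
    kinetic = np.sum(np.sin(u) ** 2 / u) * h
    tension = np.sum((x - u) * np.cos(u) ** 2) * h
    return float(kinetic - tension)

for x in (0.2, 0.5, 1.0, 2.0, 5.0):
    print(f"kL = {x:3.1f}   F = {shape_function(x): .6e}   -x^6/270 = {-x**6 / 270.0: .6e}")
```

For small $kL$ the two printed columns agree closely, and multiplying $-x^6/270$ by the prefactor $\epsilon^2\pi/8$ reproduces the coefficient $\pi\epsilon^2/2160$.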

The only other solution to equations (10) and (12) for all $(s,t)$ is for each point $s$ on $\mathbf{r}_A(s)$ to be connected to a corresponding point on $\mathbf{r}_B(s)$ by a straight line, corresponding to equation (13c). Equation (12) holds everywhere because $\lambda'(s,t) = 0$. Because $\lambda$ is zero at the boundaries it is thus zero everywhere.

An intermediate configuration then has the shape of a piecewise linear curve with a right angle 'kink' at $s^*(t)$ (see fig. 2). As $t$ progresses, the kink propagates along curve $\mathbf{r}_B$, and the horizontal part of the chain follows straight line diagonal motion, shrinking as its left end is overlaid onto curve $\mathbf{r}_B$. The solution for the velocity at all $(s,t)$ is given by $\mathbf{v}(s,t) = v_0(t)\,\Theta(s - s^*(t))\,\hat{\mathbf{e}}_v$, where $s^*(t)$ is the position of the tangent discontinuity in figure 2, which goes from $s^*(0) = 0$ to $s^*(T) = L$ as $t$ goes from $0$ to $T$. $\hat{\mathbf{e}}_v$ is a unit vector along the direction of the velocity, $\hat{\mathbf{e}}_v = (-\hat{\mathbf{x}} + \hat{\mathbf{y}})/\sqrt{2}$, and $v_0(t)$ is a speed which can be taken constant. By simple geometry, $v_0 = \sqrt{2}\,\dot{s}^*$. Because $s^*(T) = L$, $v_0 = \sqrt{2}L/T$ and $s^*(t) = Lt/T$. The total distance travelled from equation (4) is then $\mathcal{D}^* = L^2/\sqrt{2}$.

Because the transformation involves straight line motion, it is minimal. This can be seen from the second variation eq. (14). The shape of the curve at all times is given by

$$\mathbf{r}^*(s,t) = s\,\Theta(Lt/T - s)\,\hat{\mathbf{y}} + (Lt/T)\,\Theta(s - Lt/T)\,\hat{\mathbf{y}} + (s - Lt/T)\,\Theta(s - Lt/T)\,\hat{\mathbf{x}}. \tag{15}$$

Taking variations from the extremal path as before, let $\delta\mathbf{r} = \epsilon\sin k(s - Lt/T)\,\sin(n\pi t/T)\,\Theta(s - Lt/T)\,\hat{\mathbf{y}}$. These variations only act on the "free" part of the string and preserve a unit tangent to first order. The matrix $\boldsymbol{\Lambda}$ in (14) is zero for straight line transformations where $\lambda = 0$. The quadratic form $\delta\dot{\mathbf{r}} \cdot \mathbf{I} \cdot \delta\dot{\mathbf{r}}$ is non-negative, and results in a second variation $\delta^2\mathcal{D} = (32\sqrt{2})^{-1}\left[(kL)^2 + (n\pi)^2\left(1 - \mathrm{sinc}(kL)\right)\right]$, which is non-negative, monotonically increasing in $kL$, and quadratic to lowest order, with a minimum of zero at $kL = 0$. The transformation is indeed minimal.

Likewise, the minimal distance to fold a string of total length $L$ upon itself starting from a straight line (to form a hairpin) is $\mathcal{D}^* = L^2/4$.
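The two candidate transformations for figure 1A can be compared by evaluating eq. (4) directly on sampled configurations. The sketch below is illustrative only: it assumes NumPy, sets $L = T = 1$, and uses hypothetical helper names (`travelled_distance`, `kink`, `rotation`). The kink path follows eq. (15), and the rotation path is the pivoting transformation discussed earlier.

```python
import numpy as np

def travelled_distance(config, length=1.0, n_s=400, n_t=2000):
    """Approximate eq. (4): sum the path length of each material point, times ds."""
    s = (np.arange(n_s) + 0.5) * length / n_s                  # arclength midpoints
    tau = np.linspace(0.0, 1.0, n_t + 1)                       # t / T
    frames = np.stack([config(s, ti, length) for ti in tau])   # (n_t + 1, n_s, 2)
    steps = np.linalg.norm(np.diff(frames, axis=0), axis=2)    # step length per point
    return float(steps.sum() * length / n_s)

def kink(s, tau, length):
    """Eq. (15): points with s < s* already lie on B (along y); the rest are horizontal."""
    s_star = length * tau
    done = s < s_star
    x = np.where(done, 0.0, s - s_star)
    y = np.where(done, s, s_star)
    return np.stack([x, y], axis=1)

def rotation(s, tau, length):
    """Rigid rotation of the straight string about its s = 0 end by pi/2."""
    angle = 0.5 * np.pi * tau
    return np.stack([s * np.cos(angle), s * np.sin(angle)], axis=1)

d_kink = travelled_distance(kink)
d_rotation = travelled_distance(rotation)
print(f"kink:     {d_kink:.6f}   (L^2/sqrt(2) = {1 / np.sqrt(2):.6f})")
print(f"rotation: {d_rotation:.6f}   (pi L^2/4   = {np.pi / 4:.6f})")
```

The kink transformation gives the smaller value, in line with the second-variation argument above.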

Solution Degeneracy. The above example illustrates that there are essentially an infinite number of extremal transformations: one can piece together various rotations and translations for parts or all of the chain while still satisfying the EL equations. This infinity of extrema is likely to lead to nearly insurmountable difficulties for the solution of eq. (9) by direct numerical integration. For these reasons we apply a method based on analytic geometry to obtain numerical solutions. This is described in more detail below.

There is also an infinite degeneracy of solutions having the minimal distance in the above example. To see a second minimal transformation, imagine running the above solution backwards in time, so the kink propagates from $s = L$ to $s = 0$ along $\mathbf{r}_B$. But this solution should hold forwards in time for the original problem if we permute $\mathbf{r}_B$ and $\mathbf{r}_A$. Now intermediate states $\mathbf{r}^*$ first run along $\hat{\mathbf{x}}$, then $\hat{\mathbf{y}}$. But then we can introduce multiple right angle kinks in various places, without causing the trajectories in the transformation to deviate from straight lines, so that intermediate states look like staircases. As there are an infinite number of possible staircases in the continuum limit, there is an infinite degeneracy. This can lead to a tangent vector $\mathbf{r}'$ whose magnitude is length-scale dependent, and less than unity until $ds \to 0$. For example, an intermediate configuration can be drawn in figure 2 which appears as a straight diagonal line from $\mathbf{r}^*(0,t)$ to $\mathbf{r}^*(L,t)$, until $ds \to 0$ when an infinite number of step discontinuities are revealed. This problem is resolved in practice through finite-size effects involving different critical angles of rotation described below. In the continuum limit it is resolved by introducing curvature constraints.

Curvature constraints. In applications to polymer physics, chains have a stiffness characterized by a bending potential that is proportional to the square of the local curvature. Here we may choose to characterize stiffness by introducing a constraint on the configurations of the space curve, so that the curvature simply cannot exceed a given number:

$$V_\kappa(\mathbf{r}'') = \Theta(|\mathbf{r}''| < \kappa_c). \tag{16}$$

This term lifts the infinite degeneracy mentioned above, as each near-kink (with putative $\kappa > \kappa_c$) would result in slight deviations from linear motion in the above example, and thus an additional cost in the effective action. Other functional forms for $V_\kappa$ are also possible. For some applications a more conventional stiffness potential of the form $V_\kappa(\mathbf{r}'') = \frac{1}{2}A_\kappa\mathbf{r}''^2$ may be more appropriate. However, the action would then no longer consist of a true distance functional, and its minimization would involve the detailed interplay of the parameter $A_\kappa$ favouring globally minimal curvature with other factors affecting distance in the problem.

Discrete Chains. Strings with a finite number of elements (chains) provide a more accurate representation of real-world systems such as biopolymers. Discretization is also essential for numerical solutions in these more realistic cases. Monomers on a discretized chain travel along a curved metric [3], and Lagrange multipliers explicitly account for this fact here.

We start by discretizing the string into a chain of $N$ links each with length $ds = L/N$, so that equation (4) becomes $ds \int_0^T dt \sum_{i=1}^{N+1} \sqrt{\dot{\mathbf{r}}_i^2}$, with each $\mathbf{r}_i(t)$ a function of $t$ only. The total distance is the accumulated distance of all the points joining the links, plus that of the end points, all times $ds$. This approach is essentially the method of lines for solving equation (10): the PDE becomes a set of $N + 1$ coupled ODEs. Equation (5) becomes $N$ constraint equations added to the effective Lagrangian: $\sum_{i=1}^{N} \lambda_{i,i+1}\,(\mathbf{r}_{i+1} - \mathbf{r}_i)^2$. We rewrite this strictly for convenience as $\sum_{i=1}^{N} \frac{\lambda_{i,i+1}}{2}\,\mathbf{r}_{i+1/i}^2$, where $\mathbf{r}_{i+1/i} = \mathbf{r}_{i+1} - \mathbf{r}_i$, and $|\mathbf{r}_{i+1/i}| = L/N$.

The PDE in (10) then becomes $N + 1$ coupled (vector) ODEs, each of the form

$$\dot{\hat{\mathbf{v}}}_i + \lambda_{i-1,i}\,\mathbf{r}_{i/i-1} - \lambda_{i,i+1}\,\mathbf{r}_{i+1/i} = 0 \tag{17}$$

with $\lambda_{0,1} = \lambda_{N+1,N+2} = 0$. Equation (17) is consistent with (10) after suitable definitions; for example, the curvature at point $i$ after discretization is given by $(\mathbf{r}_{i+1/i} - \mathbf{r}_{i/i-1})/ds^2$.
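The discretized quantities above translate directly into array operations. The following sketch assumes NumPy and a trajectory stored as an array of shape (frames, N+1, dim); the helper names `chain_distance`, `link_lengths` and `discrete_curvature` are hypothetical. It evaluates the discretized distance for a rigid translation and the discrete curvature for nodes placed on a circle.

```python
import numpy as np

def chain_distance(trajectory, ds):
    """ds times the summed path length of all N+1 nodes (discretized eq. (4))."""
    steps = np.linalg.norm(np.diff(trajectory, axis=0), axis=2)
    return float(ds * steps.sum())

def link_lengths(nodes):
    """|r_{i+1/i}| for each link; should equal ds for an inextensible chain."""
    return np.linalg.norm(np.diff(nodes, axis=0), axis=1)

def discrete_curvature(nodes, ds):
    """(r_{i+1/i} - r_{i/i-1}) / ds^2 at the interior nodes."""
    links = np.diff(nodes, axis=0)
    return np.diff(links, axis=0) / ds**2

n_links, length = 10, 1.0
ds = length / n_links

# Rigid translation of a straight chain by d = (0, 0.5), sampled at 51 frames.
straight = np.stack([np.arange(n_links + 1) * ds, np.zeros(n_links + 1)], axis=1)
shift = np.array([0.0, 0.5])
tau = np.linspace(0.0, 1.0, 51)
trajectory = straight[None, :, :] + tau[:, None, None] * shift
translation_distance = chain_distance(trajectory, ds)

# Nodes on a circle of radius 2 separated by chords of length ds.
radius = 2.0
phi = 2.0 * np.arcsin(ds / (2.0 * radius))
angles = np.arange(n_links + 1) * phi
arc = radius * np.stack([np.cos(angles), np.sin(angles)], axis=1)
arc_curvature = np.linalg.norm(discrete_curvature(arc, ds), axis=1)

print(translation_distance)      # ds * (N + 1) * |d|
print(arc_curvature)             # 1 / radius at every interior node
```

Because all $N+1$ nodes are counted, the translation distance is $ds\,(N+1)\,|\mathbf{d}| = L|\mathbf{d}|(1 + 1/N)$, which approaches the continuum value $L|\mathbf{d}|$ as $N$ grows.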

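One link. We turn to the simplest problem of one link with end points A and B (see fig. 3), for which the action reads $L\int_0^T dt\,\left(\sqrt{\dot{\mathbf{r}}_A^2} + \sqrt{\dot{\mathbf{r}}_B^2} - \frac{\lambda}{2}\,\mathbf{r}_{B/A}^2\right)$. Points A and B have boundary conditions $\mathbf{r}_A(0) = A$, $\mathbf{r}_B(0) = B$, $\mathbf{r}_A(T) = A'$, $\mathbf{r}_B(T) = B'$. The link in our problem is taken to have a direction, so point A cannot transform to point B. The Euler-Lagrange equations become:

$$\begin{aligned} \dot{\hat{\mathbf{v}}}_A - \lambda\,\mathbf{r}_{B/A} &= 0 \\ \dot{\hat{\mathbf{v}}}_B + \lambda\,\mathbf{r}_{B/A} &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} \lambda\,\hat{\mathbf{v}}_A \cdot \mathbf{r}_{B/A} &= 0 \\ \lambda\,\hat{\mathbf{v}}_B \cdot \mathbf{r}_{B/A} &= 0 \end{aligned} \tag{18}$$

where the orthogonality of $\hat{\mathbf{v}}$ and $\dot{\hat{\mathbf{v}}}$ has been used.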

Reminiscent of eq. (12), equations (18) each have 3 solutions. For point A these are: (1) $\hat{\mathbf{v}}_A \cdot \mathbf{r}_{B/A} = 0$, or pure rotation of A about B, (2) $\mathbf{v}_A = 0$, or point A is stationary, or (3) $\lambda = 0$ and thus $\dot{\hat{\mathbf{v}}}_A = 0$ from the EL equations, indicating straight-line motion. Moreover, (1) implies $\mathbf{v}_B = 0$, or both points rotate about a common center, (2) implies $\hat{\mathbf{v}}_B \cdot \mathbf{r}_{B/A} = 0$ or B rotates, and (3) implies $\dot{\hat{\mathbf{v}}}_B = 0$ as well, so that both points move in straight lines. An extremal transformation thus involves either straight line motion, or rotations of one point about the other at rest (or common center). Once again, there are an infinite number of solutions: any combination of translations and rotations satisfies the EL equations, such as those shown in figure 3B-F.

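The Lagrange multiplier may be found from the first integral: taking the dot product of the EL equation for B with $\mathbf{r}_{B/A}$ gives $-ds^2\lambda = \mathbf{r}_{B/A} \cdot \dot{\hat{\mathbf{v}}}_B$. Thus when B moves in a straight line $\lambda = 0$. When B rotates about A, its acceleration $\mathbf{a}_B$ follows from rigid body kinematics as $\mathbf{a}_A + \boldsymbol{\alpha} \times \mathbf{r}_{B/A} - \omega^2\mathbf{r}_{B/A}$, where $\omega$ and $\boldsymbol{\alpha}$ are the angular velocity and acceleration respectively, and $\mathbf{a}_A = 0$. The unit velocity then changes at the rate $\dot{\hat{\mathbf{v}}}_B = -\omega\,\mathbf{r}_{B/A}/L$, and thus $\lambda = \omega/L$.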

The minimal solution is the one that involves the minimal amount of rotation (and monotonic approach to A'B'). This may be obtained from analytic geometry: for the example configurations in fig. 3F, point B rotates about point A until B'', where the straight line B''B' is tangent to the circle of radius $ds = L$ about A. The distance (over $ds$) is $AA' + L\theta_c + B''B'$, where $\sin\theta_c = L/(L + AA')$ and $B''B' = \sqrt{(AA')^2 + 2L(AA')}$, so for example if $AA' = 2L$, $\mathcal{D} \approx 5.168\,L^2$.
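The geometric formula above is easy to evaluate. A minimal sketch using only the Python standard library follows; the function name `one_link_distance` is a hypothetical label, and the formula is the one stated in the text with the rotation arc taken as $L\theta_c$.

```python
import math

def one_link_distance(link_length, aa_prime):
    """Distance for the one-link move of fig. 3F: ds * (AA' + L*theta_c + B''B')."""
    theta_c = math.asin(link_length / (link_length + aa_prime))          # critical angle
    tangent = math.sqrt(aa_prime**2 + 2.0 * link_length * aa_prime)      # B''B'
    return link_length * (aa_prime + link_length * theta_c + tangent)    # ds = L

example = one_link_distance(1.0, 2.0)
print(f"D = {example:.4f} L^2 for AA' = 2L")
```

Varying `aa_prime` shows how the rotational contribution $L\theta_c$ shrinks relative to the straight-line parts as the displacement $AA'$ grows.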

Chains with curvature. We can now investigate the transformation shown in figure 1B with the above methods. This is the canonical example when at least one of the space curves has non-zero curvature $\kappa$. Let $\mathbf{r}_A = R\sin(\pi s/2L)\,\hat{\mathbf{x}} + R\cos(\pi s/2L)\,\hat{\mathbf{y}}$ and $\mathbf{r}_B = s\hat{\mathbf{x}} + R\hat{\mathbf{y}}$, with $0 \le s \le L$ and $R = 2L/\pi$. We then discretize the chain into $N$ segments. According to eq. (17), the end point velocities $\mathbf{v}_1$, $\mathbf{v}_{N+1}$ obey EL equations of the same form as equations (18), and thus either rotate or translate. The situation for these links is analogous to figures 3B and 3F, in that the angle the link must rotate depends on the order of translation and rotation. The geometry in figure 1B is analogous to transformations $A'B' \to AB$ in figures 3B, 3F, in that the critical angle $\theta_c$ the link must rotate before translating is smaller if translation occurs first.

Figure 4 shows the two minimal solutions thus obtained. The transformation in fig. 4A undergoes translation away from curve $\mathbf{r}_A$, and rotation at $\mathbf{r}_B$. It is the global minimum. The transformation in 4B rotates from $\mathbf{r}_A$ through a larger critical angle (see 4B inset), and then translates to $\mathbf{r}_B$. Both solutions have a soliton-like kink that propagates across either space curve $\mathbf{r}_B$ or $\mathbf{r}_A$.

The minimal transformation follows these steps: (1) Link $\mathbf{r}_{2/1}$ rotates about $\mathbf{r}_1$, $\mathbf{v}_1 = 0$, $\hat{\mathbf{v}}_2 \cdot \mathbf{r}_{2/1} = 0$, and the Lagrange multiplier representing the conjugate 'force' $\lambda_{12} \neq 0$. During this rotation, nodes 3, 4, ... move in straight lines formed by their initial values $\mathbf{r}_{A3}, \mathbf{r}_{A4}, \ldots$ and the tangent points to circles of radius $ds$ centered at $\mathbf{r}_{B2}, \mathbf{r}_{B3}, \ldots$. The corresponding Lagrange constraint forces $\lambda_{23}, \lambda_{34}, \ldots$ are all zero. Links $\mathbf{r}_{3/2}, \mathbf{r}_{4/3}, \ldots$ all adjust their orientation to ensure straight-line motion of their end points (dashed lines in fig. 4A), except for $\mathbf{r}_2$ which follows a curved path. (2) When link $\mathbf{r}_{2/1}$ completes its rotation, it coincides with curve $\mathbf{r}_B$, and the process starts again with link $\mathbf{r}_{3/2}$ which begins its rotation about $\mathbf{r}_2$, while nodes 4, 5, ... move in straight lines. This process continues until the final link $\mathbf{r}_{N+1/N}$ rotates into place on $\mathbf{r}_B$. The transformation in 4B is essentially the time-reverse of the above, but starting at curve $\mathbf{r}_B$ and ending on $\mathbf{r}_A$.

For ideal chains without curvature constraints, the distances obtained from the two transformations in 4A,B differ non-extensively as the number of links $N \to \infty$. Moreover, the distance for each transformation itself differs non-extensively from the Mean Root Square distance $\mathrm{MRSD} = \sum_{i=1}^{N} \sqrt{(\mathbf{r}_{Ai} - \mathbf{r}_{Bi})^2}$ as $N \to \infty$. [†††] Specifically, the distance travelled by straight line motion scales as $ds\,N L \sim L^2$, while the distance travelled by rotational motion scales as $ds\,(N\theta_c\,ds) \sim L^2/N$.



[†††] The MRSD is always less than or equal to the Root Mean Square Deviation or RMSD between structures, as can be shown by applying Hölder's inequality.
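The inequality in the footnote can be illustrated numerically. The sketch below assumes NumPy, assumes that both quantities are normalized as per-node averages (the normalization is our assumption), and compares the structures as given, without any optimal superposition step. The random coordinates are illustrative only.

```python
import numpy as np

def mrsd(a, b):
    """Mean of the per-node displacement magnitudes (assumed 1/N normalization)."""
    return float(np.mean(np.linalg.norm(a - b, axis=1)))

def rmsd(a, b):
    """Root of the mean squared per-node displacement, with no superposition."""
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

rng = np.random.default_rng(0)
chain_a = rng.normal(size=(50, 3))
chain_b = rng.normal(size=(50, 3))
shifted = chain_a + np.array([0.3, -0.4, 1.2])    # rigid translation, |d| = 1.3

print(mrsd(chain_a, chain_b), rmsd(chain_a, chain_b))   # MRSD <= RMSD
print(mrsd(chain_a, shifted), rmsd(chain_a, shifted))   # equal for a pure translation
```

Equality holds only when every node is displaced by the same amount, as in the rigid translation.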
On the other hand, curvature constraints as in eq. (16) become more severe on consecutive links as $N \to \infty$, and can yield extensive corrections to the distance. Specifically, the increase in distance $\Delta\mathcal{D}$ due to curvature constraints scales like the radius of curvature $R$ times $N$, since every node is affected by the rounded kink as it propagates. So $\Delta\mathcal{D} \sim ds\,N R \sim LR$. The importance of this effect then depends on how $R$ compares to $L$ (the ratio of the persistence length to the total length). It does not vanish as $N \to \infty$. Non-crossing constraints described below also yield extensive corrections to the distance travelled.

C. Non-crossing space curves 

The minimal transformation may be qualitatively different when chain crossing is explicitly disallowed. Figure 1C illustrates a pair of curves that differ only by the order of chain crossing. They are displaced in the figure for easier visualization but should be imagined to overlap so the quantity $\int_0^L ds\,|\mathbf{r}_A - \mathbf{r}_B| \approx 0$, i.e. if they were ghost chains their distance would be nearly zero, and most existing metrics give zero distance between these curve pairs (see Table I).

Analogous to the construction of Alexander polynomials for knots, if we form the orthogonal projection of these space curves onto a plane there will be double points indicating one part of the curve crossing over or under another. To transform from configuration A to B without crossing, the curves must always go through configurations having zero double points. If we trace the curve in an arbitrary but fixed direction, each double point occurs twice, once as an underpass and once as an overpass. We may call the part of the curve between two consecutive passes a bridge. If the bridge ends in an overpass we assign it +1, if the bridge ends in an underpass we assign it -1, so traversing from the left in figure 1C, curve A has (+1) sense, and curve B (-1). The change in sense during any transformation obeying non-crossing is always ±1, while ghost chains can have changes of ±2.

The non-crossing condition means that the Lagrangian for the minimal transformation now depends on the position $\mathbf{r}(s,t)$ of the space curve, which may be accounted for using an Edwards potential: $V_{NC}[\mathbf{r}(s,t)] = \int_0^L ds_1 \int_0^L ds_2\, \delta(\mathbf{r}(s_1,t) - \mathbf{r}(s_2,t))$. In practice a Gaussian may be used to approximate the delta function, with a variance that may be adjusted to account for the thickness or volume of the chain.
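A discretized version of this potential with a Gaussian in place of the delta function might look as follows. This is a sketch under stated assumptions, not the authors' implementation: NumPy is assumed, the function name `edwards_potential` and the parameters `sigma` and `skip` are hypothetical, and pairs of nodes closer than `skip` links along the chain are excluded so that neighbouring nodes do not dominate the sum.

```python
import numpy as np

def edwards_potential(nodes, ds, sigma, skip=2):
    """ds^2 * sum over node pairs of a normalized Gaussian of width sigma.

    Pairs with |i - j| <= skip are excluded (an assumption made here so that
    chain neighbours, which are always close, do not contribute).
    """
    diff = nodes[:, None, :] - nodes[None, :, :]
    dist2 = np.sum(diff**2, axis=2)
    dim = nodes.shape[1]
    gauss = np.exp(-dist2 / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2) ** (dim / 2.0)
    idx = np.arange(len(nodes))
    far = np.abs(idx[:, None] - idx[None, :]) > skip
    return float(ds**2 * gauss[far].sum())

ds, sigma = 0.05, 0.02
x = np.arange(21) * ds

# Straight chain of 20 links along x.
straight = np.stack([x, np.zeros(21), np.zeros(21)], axis=1)

# Chain folded back exactly onto itself: nodes 11..20 retrace nodes 9..0.
folded_x = np.concatenate([x[:11], x[9::-1]])
folded = np.stack([folded_x, np.zeros(21), np.zeros(21)], axis=1)

v_straight = edwards_potential(straight, ds, sigma)
v_folded = edwards_potential(folded, ds, sigma)
print(v_straight, v_folded)
```

The straight chain gives a negligible value, while the self-overlapping chain is heavily penalized; increasing `sigma` mimics a thicker chain.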

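The Euler-Lagrange equation now becomes

$$(V_{NC})_{\mathbf{r}} = (\mathcal{L}_{\mathbf{r}'})_s + (\mathcal{L}_{\dot{\mathbf{r}}})_t - \left[(V_\kappa)_{\mathbf{r}''}\right]_{ss} \tag{19}$$

where the curvature potential in eq. (16) has been included, and the notation $(\mathcal{L}_{\mathbf{r}'})_s = (d/ds)(\partial\mathcal{L}/\partial\mathbf{r}')$ has been used. Equation (10) is now modified to

$$\dot{\hat{\mathbf{v}}} = (\lambda\hat{\mathbf{t}})_s + \nabla V_{NC} + \left[(V_\kappa)_{\mathbf{r}''}\right]_{ss}. \tag{20}$$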

To access various conformations, the minimal transformation must now abide by the non-trivial 
geometrical constraints that are induced by non-crossing. In general this renders the problem difficult; 
however, the example in figure 1C is simple enough to propose a mechanism for the minimal 
transformation consistent with the developments above, without explicitly solving the EL equations in this case. 
In analogy with the hairpin transformation described below eq. (15), the transformation here involves 
essentially forming and then unforming a hairpin. $\mathbf{r}_A(N)$ (the blue end of curve $\mathbf{r}_A$ in fig. 1C) propagates 
back along its own length until it reaches the junction, where it then rotates over it to become the overpass 
(this takes essentially zero distance in the continuum limit). The curve then doubles back, following its 
path in reverse to its starting point. This transformation is fully consistent with the allowed extremal 
rotations and translations of the discretized chain. The distance in the continuum limit is $\mathcal{D} = \int_0^{\ell} ds\,(2s) = \ell^2$, 
where $\ell$ is the length of the shorter arm extending from the junction in fig. 1C. 
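
The continuum result can be checked against a discretized arm. The sketch below takes the integrand at face value, i.e. it assumes that an element at arclength s along the shorter arm contributes a total path of 2s, and sums this over links with the midpoint rule. The function name `hairpin_distance` and the choice of midpoint sampling are assumptions made for the illustration.

```python
def hairpin_distance(ell, n_links):
    """Midpoint-rule sum of the integrand 2s over the shorter arm [0, ell]."""
    ds = ell / n_links
    # Each link sits at its midpoint s_i = (i + 0.5) * ds and contributes 2 * s_i * ds
    return sum(2.0 * (i + 0.5) * ds * ds for i in range(n_links))

d_coarse = hairpin_distance(1.0, 10)    # 10 links on an arm of unit length
d_fine = hairpin_distance(0.3, 1000)    # 1000 links on an arm of length 0.3
```

Because the integrand is linear in s, the midpoint sum reproduces the square of the arm length for any number of links, up to rounding.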

III. DISCUSSION 

The distance between finite objects of any dimension $d$ is a variational problem, and may be calculated 
by minimizing a vector functional of $d + 1$ independent variables. Here we formulated the problem for 
space curves, where the function $\mathbf{r}^*(s,t)$ defining the transformation from curve $\mathbf{r}_A$ to curve $\mathbf{r}_B$ gives the 
minimal distance $\mathcal{D}$. 

We provided a general recipe for the solution to the problem through the calculus of variations. For 
simple cases the solution is analytically tractable. Generally there are an infinity of extrema, and direct 
numerical methods are unlikely to be fruitful. We employed a method that interpreted the discretized EL 
equations geometrically to obtain minimal solutions. The various solutions obtained here are summarized 
in Table I, and compared with other similarity measures currently used. 

The distance metric may be generalized to higher dimensional manifolds; for example, a two-dimensional 
surface needs three independent parameters to describe the transformation. The distance becomes 
$\mathcal{D} = \int du \int dv \int dt\, |\dot{\mathbf{r}}|$ and the constant unit area condition becomes $|\partial\mathbf{r}/\partial u \times \partial\mathbf{r}/\partial v| = 1$. 

The question of a distance metric between configurations of a biopolymer has occupied the minds 
of many in the protein folding community for some time (see for example [4-8]). Such a metric is 
of interest for comparison between folded structures, as well as to quantify how close an unfolded or 
partly folded structure is to the native structure. Chan and Dill [5] investigated the minimum number of moves 
necessary to transform one lattice structure to another, in particular while breaking the smallest number 
of hydrogen bonds. Leopold et al. [4] investigated the minimum number of monomers that had to be 
moved to transform one compact conformation to another. Falicov and Cohen [6] investigated structural 
comparison by rotation and translation until the minimal area surface by triangulation was obtained 
between two potentially dissimilar protein structures. 

The present theoretical framework allows computation of a minimal distance between proteins of the 
same length by rotating and translating until $\mathcal{D}$ is minimized, as done in the calculation of RMSD. 
Comparison between proteins of different lengths would involve further optimization with respect to 
insertion or deletion of protein chain segments. 
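
For reference, the rigid-body superposition used in RMSD calculations can be sketched with the standard Kabsch procedure (centroid subtraction followed by an SVD-based rotation). This illustrates only the RMSD step mentioned above, not a minimization of the distance itself; the function name `kabsch_rmsd` and the test coordinates are assumptions chosen for the example.

```python
import numpy as np

def kabsch_rmsd(a, b):
    """RMSD between two (N, 3) coordinate sets after optimal translation and rotation."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Remove the translational degree of freedom by centring both sets
    a0 = a - a.mean(axis=0)
    b0 = b - b.mean(axis=0)
    # 3x3 covariance matrix and its singular value decomposition
    h = a0.T @ b0
    u, _, vt = np.linalg.svd(h)
    # Sign correction keeps the result a proper rotation (no reflection)
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = a0 @ rot.T - b0
    return np.sqrt(np.mean(np.sum(diff**2, axis=1)))

# Four-bead chain, a rotated and shifted copy, and its mirror image
chain = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]], dtype=float)
rot_z = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], dtype=float)
moved = chain @ rot_z.T + np.array([3.0, -2.0, 5.0])
mirrored = chain * np.array([1.0, 1.0, -1.0])

rmsd_moved = kabsch_rmsd(chain, moved)        # rigid motion: vanishes up to rounding
rmsd_mirrored = kabsch_rmsd(chain, mirrored)  # reflection cannot be rotated away
```

The same rotate-and-translate search would apply to the distance, with the quantity being minimized replaced by the distance functional rather than the sum of squared displacements.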

It is interesting to ask which folded structures have the largest or smallest average distance $\langle\mathcal{D}\rangle$ from 
an ensemble of random coil structures, and also whether the accessibility of these structures in terms of 
$\mathcal{D}$ translates to their folding rates. It can also be determined whether the distance to a structure correlates 
with kinetic proximity in terms of its probability to fold before unfolding [7], by calculating $\langle\mathcal{D}\,p_F\rangle$. 
The question of the most accessible or least accessible structure may be formulated variationally as a 
free-boundary or variable end-point problem. 

It is an important future question to address whether the entropy of paths to a particular structure is 
as important as the minimal distance. In this sense it may be the finite "temperature" ($\beta < \infty$) partition 
function $Z(\beta) = \int d[\mathbf{r}(s,t)] \exp(-\beta\,\mathcal{D}[\mathbf{r}(s,t)])$, i.e. the sum over paths weighted by their 'actions', which 
is the most important quantity in determining the accessibility between structures. This has an analogue 
in the quantum string: we investigated only $Z(\infty)$ here. We hope that this work proves useful in laying 
the foundations for unambiguously defining distance between biomolecular structures in particular and 
high-dimensional objects in general. 
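
The interplay between path entropy and minimal distance can be illustrated with a toy model in which the functional integral is replaced by a sum over a finite, hand-picked set of paths. This is a deliberate simplification: the function name `free_distance`, the quantity $-\beta^{-1}\ln Z$ it returns, and the path distances below are hypothetical and serve only to show the limiting behaviour.

```python
import math

def free_distance(distances, beta):
    """Return -(1/beta) * ln Z for Z = sum_k exp(-beta * D_k) over a finite path set."""
    d_min = min(distances)
    # Shift by d_min so that the exponentials cannot overflow
    z_shifted = sum(math.exp(-beta * (d - d_min)) for d in distances)
    return d_min - math.log(z_shifted) / beta

# Hypothetical path sets to two structures, in units of L^2 (made-up numbers)
few_short_paths = [0.25]            # a single path with a short distance
many_longer_paths = [0.30] * 100    # many paths, each slightly longer

warm_few = free_distance(few_short_paths, beta=10.0)
warm_many = free_distance(many_longer_paths, beta=10.0)
cold_few = free_distance(few_short_paths, beta=1000.0)
cold_many = free_distance(many_longer_paths, beta=1000.0)
```

At small $\beta$ the structure reached by many paths is the more accessible of the two despite its larger minimal distance; as $\beta$ grows the ordering reverts to that of the minimal distances, which is the limit studied in this work.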
FIGURE CAPTIONS 

FIGURE 1: Three representative pairs of curves. (A) Straight line curve rotated by $\pi/2$. (B) One string has 
a finite radius of curvature, the other is straight. (C) A canonical example where non-crossing is important; 
the curves are displaced for easy visualization but should be imagined to be superimposed. 

FIGURE 2: The minimal transformation from A to B in figure 1A involves the propagation of a kink 
along curve B. The end point of the curve at intermediate states satisfies x + y = L, the equation for a 
straight line. A similar linear equation holds for any point on the curve, thus no solution with shorter 
distance can exist. An intermediate configuration is shown in red. Alternative transformations are possible 
with kinks along A, as well as multiple kinks (see text). 

FIGURE 3: Transformations between two rigid rods. (A) undergoes simultaneous translation and 
rotation and so is not extremal. (B) is extremal and minimal. The rod cannot rotate any less given that it 
translates first. However this transformation is a weak or local minimum. (C), (D), and (E) are extremal 
but not minimal. (F) is the global minimum. It rotates the minimal amount, and both A and B move 
monotonically towards A', B'. A purely straight-line transformation exists but involves moving point A 
away from A' before moving towards it (similar to (D)), thus covering a larger distance than the minimal 
transformation. 

FIGURE 4: Two minimal transformations between the curves shown in fig. 1B, for N = 10 links. Fig. 
(A) is the global minimal transformation $\mathbf{r}^*(s,t)$, with $\mathcal{D}^* \approx 0.330 L^2$; figure (B) is a local minimum with 
$\mathcal{D} \approx 0.335 L^2$. In (A), links with one end touching curve $\mathbf{r}_B$ rotate; the others translate first from $\mathbf{r}_A$, 
rotating only when one end of a link has touched $\mathbf{r}_B$. In (B) they rotate first from $\mathbf{r}_A$, then translate into 
$\mathbf{r}_B$. Dashed lines in (A) show the paths travelled for each bead. The inset of (A) plots the total distance 
travelled as a function of the number of links N, with various N plotted as filled circles to indicate the 
rapid decrease and asymptotic limit to $\mathcal{D}_\infty \approx 0.251 L^2$. The inset in (B) shows the minimal angle each 
link must rotate during the transformation; it is less for the transformation in (A). Movie animations of 
these transformations are provided as Supporting Information. 
TABLES AND TABLE CAPTIONS 



TABLE I: Values of the distance for various examples considered here, compared to other metrics. 

| Curve pair                         | $\mathcal{D}^*$ ($L^2$) | RMSD (a) ($L$) | $(1-Q)$ (b) | $\chi$ (c) |
|------------------------------------|-------------------------|----------------|-------------|------------|
| Trivial translation                | $|d|/L$                 | $|d|/L$        |             |            |
| "L-curves", fig. 1A                | $1/\sqrt{2}$            |                | (f)         |            |
| Straight line to hairpin           | 1/4                     | $1/\sqrt{6}$   | 1           | 1/2        |
| "C-curve" to st. line, fig. 4A     | 0.330                   | 0.371          | (f)         | 0.417      |
| "C-curve" to st. line, fig. 1A (d) | 0.251                   | 0.334          | (f)         | 1          |
| "Over/under" curves, fig. 1C       | $(\ell/L)^2$            | $\approx 0$    | 0 (g)       |            |
| Single link, fig. 3F (e)           | 5.168 (i)               |                | (h)         | (h)        |

(a) RMSD $= \sqrt{N^{-1}\sum_i (\mathbf{r}_{Ai} - \mathbf{r}_{Bi})^2}$. 
(b) $Q$ is the fraction of shared contacts A has with B; see [7, 8] for definitions. 
(c) Structural overlap function, equal to 1 minus the fraction of residue pairs 
with similar distances in structures A and B. The formula in ref. [9] is used. 
(d) I.e. in the continuum limit. 
(e) For AA' = 2 x link length. 
(f) 0/0 or undefined. 
(g) Assuming a contact is made at the junction. 
(h) Undefined for a single link. 
(i) $\mathcal{D}$ is larger than the RMSD here because RMSD contains a factor of 2 while $\mathcal{D}$ 
does not. We could have computed the "effective distance" for the rod by dividing 
by 2. 
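
The "Straight line to Hairpin" row can be checked with a short numerical sketch. The geometry used here is an assumption made for the check, not a statement of how the table values were obtained: the chain is taken to fold at its midpoint onto itself, the first half stays superimposed, each bead on the second half moves in a straight line from x = s to x = L - s, and no further superposition is applied before taking the RMSD. The function name `hairpin_row` is hypothetical.

```python
import math

def hairpin_row(n_beads, length=1.0):
    """Distance and RMSD between a straight line and its hairpin, by midpoint sums."""
    ds = length / n_beads
    total_path = 0.0
    sq_sum = 0.0
    for i in range(n_beads):
        s = (i + 0.5) * ds
        # Beads on the second half fold from x = s back to x = length - s;
        # beads on the first half do not move
        move = max(0.0, 2.0 * s - length)
        total_path += move * ds      # contributes to the distance, units of L^2
        sq_sum += move * move        # contributes to the mean-square displacement
    return total_path, math.sqrt(sq_sum / n_beads)

dist, rmsd = hairpin_row(2000)
```

Under these assumptions the sums approach 1/4 (in units of the squared chain length) and $1/\sqrt{6}$ (in units of the chain length), matching the two entries in that row.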




FIG. 2: 
FIG. 4: 



